Do Trace Cache, Value Prediction and Prefetching Improve SMT Throughput?
Authors
Abstract
While trace cache, value prediction, and prefetching have been shown to be effective in single-threaded superscalar processors, these techniques have not been analyzed in a Simultaneously Multithreaded (SMT) processor. SMT introduces new factors both for and against these techniques, and it is not known how they would fare. We evaluate these techniques in an SMT processor to provide recommendations for future SMT designs. Our key contributions are: (1) we identify a fundamental interaction between the techniques and SMT’s sharing of resources among multiple threads, and (2) we quantify the impact of this interaction on SMT throughput. SMT’s sharing of the instruction storage (i.e., trace cache or i-cache), physical registers, and issue queue affects the effectiveness of trace cache, value prediction, and prefetching, respectively.
Similar resources
Increasing Predictive Accuracy through Limited Prefetching
Prefetching multiple files per prediction can improve predictive accuracy. However, it comes at the cost of extra cache space and disk bandwidth. This paper discusses the most Recent distinct Successor (RnS) model and uses it to demonstrate the effectiveness of our earlier work, the Program-based Last Successor (PLnS) model, a program-based prediction algorithm [21]. We analyze the simu...
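The successor-model family this abstract builds on can be illustrated with a minimal last-successor predictor: for each file, remember the file that most recently followed it and predict that as the next access. This is a simplified sketch for intuition only, not the paper's RnS or PLnS models:

```python
def simulate(accesses):
    """Replay a file-access trace and count correct next-file predictions.

    Minimal last-successor predictor: predict that the next access after
    file X will be whatever followed X most recently in the trace.
    """
    successor = {}       # file -> most recently observed next file
    hits = total = 0
    for prev, nxt in zip(accesses, accesses[1:]):
        if prev in successor:          # a prediction exists for this file
            total += 1
            if successor[prev] == nxt:
                hits += 1
        successor[prev] = nxt          # learn/update the successor
    return hits, total

# On an alternating trace the predictor is right until the pattern breaks:
# simulate(["a", "b", "a", "b", "a", "c"]) yields (2, 3).
```

The RnS and PLnS models refine this idea (e.g., by tracking distinct successors or conditioning on the accessing program), which is where the accuracy gains discussed in the paper come from.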
Optimizing SMT Processors for High Single-Thread Performance
Simultaneous Multithreading (SMT) processors achieve high processor throughput at the expense of single-thread performance. This paper investigates resource allocation policies for SMT processors that preserve, as much as possible, the single-thread performance of designated “foreground” threads, while still permitting other “background” threads to share resources. Since background threads on s...
A Performance Study of Instruction Cache Prefetching Methods
Prefetching methods for instruction caches are studied via trace-driven simulation. The two primary methods are “fallthrough” prefetch (sometimes referred to as “one block lookahead”) and “target” prefetch. Fall-through prefetches are for sequential line accesses, and a key parameter is the distance from the end of the current line where the prefetch for the next line is initiated. Target prefe...
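The fall-through ("one block lookahead") policy described above can be sketched in a few lines: issue a prefetch for the next sequential cache line whenever an access falls within some distance of the end of its current line. The line size and distance values below are illustrative assumptions, not parameters from the paper:

```python
LINE_SIZE = 64        # cache line size in bytes (illustrative)
FETCH_DISTANCE = 16   # prefetch next line within this many bytes of line end

def prefetch_targets(addresses):
    """Return the set of line numbers a fall-through prefetcher would fetch.

    For each access, if the byte offset is within FETCH_DISTANCE of the
    end of its cache line, prefetch the next sequential line.
    """
    prefetched = set()
    for addr in addresses:
        line, offset = divmod(addr, LINE_SIZE)
        if LINE_SIZE - offset <= FETCH_DISTANCE:
            prefetched.add(line + 1)
    return prefetched
```

A smaller FETCH_DISTANCE issues prefetches later (less waste on non-sequential code, less latency hidden); a larger one issues them earlier, which is exactly the key parameter trade-off the abstract refers to.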
Dead-block prediction & dead-block correlating prefetchers
Effective data prefetching requires accurate mechanisms to predict both “which” cache blocks to prefetch and “when” to prefetch them. This paper proposes the DeadBlock Predictors (DBPs), trace-based predictors that accurately identify “when” an L1 data cache block becomes evictable or “dead”. Predicting a dead block significantly enhances prefetching lookahead and opportunity, and enables placi...
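The trace-based prediction idea can be sketched as follows: the "trace" of a block is a running signature of the instruction PCs that touch it, signatures observed at eviction time are learned as death signatures, and a block whose live signature matches a learned one is predicted dead. This is a toy sketch of the general mechanism, not the paper's exact DBP design; the hash width and table structure are assumptions:

```python
class DeadBlockPredictor:
    """Toy trace-based dead-block predictor (illustrative sketch)."""

    def __init__(self):
        self.death_sigs = set()   # signatures seen when a block was evicted
        self.live_sig = {}        # block -> running trace signature

    def access(self, block, pc):
        """Record a touch; return True if the block is predicted dead."""
        # Fold the touching PC into the block's trace signature.
        sig = hash((self.live_sig.get(block, 0), pc)) & 0xFFFF
        self.live_sig[block] = sig
        return sig in self.death_sigs

    def evict(self, block):
        """Learn the block's final signature as a death signature."""
        sig = self.live_sig.pop(block, None)
        if sig is not None:
            self.death_sigs.add(sig)
```

Once a block's trace matches a learned death signature, the predictor knows the block will not be touched again before eviction, which is what gives the prefetcher the extra lookahead the abstract describes.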
Performance Modeling of Memory Latency Hiding Techniques
Due to the ever-increasing computational power of contemporary microprocessors, the execution time spent on actual arithmetic computations (i.e., computations not involving slow memory operations such as cache misses) is significantly reduced. Therefore, for memory intensive workloads, it is more important to overlap multiple cache misses than to overlap slow memory operations with other comput...
Journal title:
Volume, Issue:
Pages: -
Publication date: 2006